Disaster Recovery

Antei maintains disaster recovery (DR) practices designed to preserve service continuity and data integrity, and to restore service quickly after a disruption or incident.


Recovery Objectives

Metric                            Target
Recovery Time Objective (RTO)     ≤ 2 hours for core services
Recovery Point Objective (RPO)    ≤ 15 minutes of data loss
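
To make the two targets concrete, the following is a minimal sketch of how an incident could be checked against them. The function name, its parameters, and the sample timestamps are illustrative assumptions, not part of Antei's tooling.

```python
from datetime import datetime, timedelta

# Targets taken from the Recovery Objectives above.
RTO = timedelta(hours=2)
RPO = timedelta(minutes=15)

def meets_objectives(outage_start, service_restored, last_recoverable_point):
    """Return (rto_met, rpo_met) for a single incident.

    Downtime is measured from the start of the outage to restoration.
    The data loss window is the gap between the last recoverable point
    and the start of the outage.
    """
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_recoverable_point
    return downtime <= RTO, data_loss_window <= RPO

# Hypothetical incident: 95 minutes of downtime, last backup 9 minutes old.
outage = datetime(2024, 1, 10, 12, 0)
print(meets_objectives(
    outage_start=outage,
    service_restored=outage + timedelta(minutes=95),
    last_recoverable_point=outage - timedelta(minutes=9),
))  # (True, True)
```

An incident with 3 hours of downtime or a 20-minute-old backup would return False for the corresponding target.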

Backup Strategy

  • PostgreSQL Backups (GCP)
    • Automated point-in-time backups every 15 minutes
    • Daily full snapshots stored for 30 days
  • Cloudflare R2 Objects
    • Versioned storage for all key documents (invoices, attachments)
    • Lifecycle policy to retain 90 days of object versions
  • Xano Metadata & Logs
    • Daily exports of audit logs and configuration stored in R2
    • Retention for 180 days
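
The retention periods above can be expressed as a simple age check. This is a sketch only: the dictionary keys and the `is_expired` helper are hypothetical labels chosen for illustration, and real expiry would be enforced by the storage provider's own lifecycle configuration.

```python
from datetime import datetime, timedelta, timezone

# Retention periods in days, as stated in the Backup Strategy list.
# The keys are hypothetical labels, not real resource names.
RETENTION_DAYS = {
    "postgres_daily_snapshot": 30,
    "r2_object_version": 90,
    "xano_export": 180,
}

def is_expired(kind, created_at, now):
    """Return True if an artifact of this kind is older than its retention period."""
    return now - created_at > timedelta(days=RETENTION_DAYS[kind])

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
created = now - timedelta(days=31)
print(is_expired("postgres_daily_snapshot", created, now))  # True
print(is_expired("r2_object_version", created, now))        # False
```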

Failover & Continuity

  • Multi-Region Read Replicas
    • PostgreSQL read replicas in secondary GCP regions for failover
  • Worker Redeployment
    • Cloudflare Workers automatically redeployed across edge nodes
  • API Layer Resilience
    • Xano deployed across multiple GCP zones, with automatic traffic rerouting on failure
  • Auxiliary Service Redundancy
    • Railway and Render services configured with health checks and retry policies
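
As an illustration of the health-check-and-retry pattern mentioned above, here is a generic sketch. It is not the Railway or Render configuration: the probes are plain callables, and the names `check_with_retries` and `pick_endpoint` and the default attempt count and delays are assumptions made for the example.

```python
import time

def check_with_retries(probe, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call probe() until it returns True or the attempts run out.

    Waits base_delay, then 2 * base_delay, and so on between failed
    attempts (exponential backoff). Returns True if a probe succeeded.
    """
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except Exception:
            pass  # an exception counts as an unhealthy response
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False

def pick_endpoint(probes, **retry_options):
    """Return the first endpoint name whose probe passes, or None.

    probes maps endpoint names to probe callables, in priority order.
    """
    for name, probe in probes.items():
        if check_with_retries(probe, **retry_options):
            return name
    return None

# Hypothetical probes: the primary always fails, the fallback recovers
# on its third attempt. Sleeping is disabled to keep the demo instant.
fallback_results = iter([False, False, True])
probes = {
    "primary": lambda: False,
    "fallback": lambda: next(fallback_results),
}
print(pick_endpoint(probes, sleep=lambda _: None))  # fallback
```

In practice each probe would wrap an HTTP request to the service's health endpoint, and the retry limits would be tuned so that failover still completes within the RTO.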

Incident Response Process

  1. Detection & Alerting
    • Automated monitoring triggers alerts for service errors, latency spikes, and downtime
    • Incident tickets created in PagerDuty (or equivalent)
  2. Containment & Mitigation
    • Traffic rerouted to healthy regions or fallback endpoints
    • Read-only mode activated if necessary to preserve data integrity
  3. Recovery & Restoration
    • Data restored from the most recent snapshot or point-in-time backup taken before the incident, to meet the RPO
    • Services restarted in failover region within RTO targets
  4. Post-Incident Review
    • Root cause analysis documented
    • Action items tracked and prioritized in backlog
    • DR plan updated based on lessons learned
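
The restore step in stage 3 depends on choosing the right recovery point. The sketch below shows one way to pick it, assuming backups are identified by their timestamps. The helper name and the sample times are hypothetical.

```python
from datetime import datetime, timedelta

RPO = timedelta(minutes=15)  # target from the Recovery Objectives

def choose_restore_point(backup_times, incident_time):
    """Return the latest backup taken at or before incident_time, or None.

    Backups taken after the incident are skipped because they may
    already contain corrupted or incomplete data.
    """
    candidates = [t for t in backup_times if t <= incident_time]
    return max(candidates) if candidates else None

# Hypothetical backups at 13:30, 13:45, 14:00, and 14:15.
backups = [datetime(2024, 3, 5, 13, 30) + timedelta(minutes=15 * i) for i in range(4)]
incident = datetime(2024, 3, 5, 14, 7)

restore_point = choose_restore_point(backups, incident)
print(restore_point)                     # 2024-03-05 14:00:00
print(incident - restore_point <= RPO)   # True
```

Here the 14:15 backup is ignored, and the 14:00 backup leaves a 7-minute data loss window, which is within the 15-minute RPO.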

Testing & Validation

  • Quarterly DR Drills
    • Simulated outages to validate failover procedures and recovery scripts
  • Backup Restore Tests
    • Monthly restore exercises from R2 and PostgreSQL snapshots
  • Documentation Reviews
    • DR plan reviewed semi-annually to incorporate infrastructure or process changes

Next Steps